Response-Hiding Searchable Encryption

ABSTRACT

A method for providing response-hiding searchable encryption includes receiving a search query for a keyword from a user device associated with a user. The keyword appears in one or more encrypted documents within a corpus of encrypted documents stored on an untrusted storage device. The method also includes accessing a document oblivious key-value storage (OKVS) to obtain a list of document identifiers associated with the keyword. Each document identifier in the list of document identifiers is associated with a respective keyword identifier concatenated with the keyword and uniquely identifies a respective one of the one or more encrypted documents that the keyword appears in. The method also includes returning the list of document identifiers obtained from the document OKVS to the user device.

CROSS REFERENCE TO RELATED APPLICATIONS

This U.S. patent application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Application 62/838,111, filed on Apr. 24, 2019, which is hereby incorporated by reference in its entirety.

TECHNICAL FIELD

This disclosure relates to response-hiding searchable encryption.

BACKGROUND

Searchable encryption (i.e., encrypted search) has garnered significant attention for the past many years. Increasingly, a user or client owns a large corpus of encrypted documents that are stored at a server not under the client's control (i.e., the server is untrusted). With searchable encryption, the client can store his/her encrypted documents on the untrusted server, but still maintain the capability of searching the documents and, for example, retrieve identifiers of all documents containing a specific keyword. However, currently available methods are either extremely computationally expensive (e.g., fully homomorphic encryption) such that they become prohibitive on sufficiently large sets of documents or sacrifice a portion of the security and privacy of the documents.

SUMMARY

One aspect of the disclosure provides a method for providing response-hiding searchable encryption. The method includes receiving, at data processing hardware, a search query for a keyword from a user device associated with a user. The keyword appears in one or more encrypted documents within a corpus of encrypted documents stored on an untrusted storage device. The method also includes accessing, by the data processing hardware, a document oblivious key-value storage (OKVS) to obtain a list of document identifiers associated with the keyword. Each document identifier in the list of document identifiers is associated with a respective keyword identifier concatenated with the keyword and uniquely identifies a respective one of the one or more encrypted documents that the keyword appears in. The method also includes returning, by the data processing hardware, the list of document identifiers obtained from the document OKVS to the user device.

Implementations of the disclosure may include one or more of the following optional features. In some implementations, the method further includes receiving, at the data processing hardware from the user device, a read request including one or more of the document identifiers from the returned list of document identifiers. For each document identifier received in the read request, the method includes retrieving, by the data processing hardware, the respective one of the one or more encrypted documents that the keyword appears in from the untrusted storage device and returning, by the data processing hardware, the retrieved respective one of the one or more encrypted documents that the keyword appears in to the user device. The user device is configured to decrypt the retrieved respective one of the one or more encrypted documents.

The method, in some examples, further includes, for a new encrypted document uploaded by the user into the corpus of encrypted documents stored on the untrusted storage device, receiving, at the data processing hardware from the user device, a set of keywords associated with the new encrypted document and a new document identifier uniquely identifying the new encrypted document. The method may also include determining, by the data processing hardware, whether the new document identifier exists in an identifier OKVS, the identifier OKVS including a set of document identifiers. Each document identifier in the set of document identifiers uniquely identifies a respective one of the encrypted documents within the corpus of encrypted documents stored on the untrusted storage device. When the new document identifier does not exist in the identifier OKVS, the method may include updating, by the data processing hardware, the identifier OKVS with the new document identifier uniquely identifying the new encrypted document. For each keyword in the set of keywords associated with the new encrypted document, the method optionally includes incrementing, by the data processing hardware, a keyword count associated with the keyword in a counts OKVS. The counts OKVS includes a plurality of keyword counts. Each keyword count indicates a number of the encrypted documents within the corpus of encrypted documents that a respective keyword appears in. The method may also include inserting, by the data processing hardware, a concatenation of the keyword and a respective keyword identifier associated with the new document identifier into the document OKVS.

Incrementing the keyword count associated with the keyword in the counts OKVS may include, when the keyword count is greater than or equal to one, increasing the keyword count by one. The method may also include, when the keyword count is not greater than or equal to one, setting the keyword count to one. In some implementations, the method further includes, when the new document identifier exists in the identifier OKVS, discarding, by the data processing hardware, the new document identifier and the set of keywords associated with the new encrypted document.

In some examples, where the search query for the keyword received from the user device includes a query count, the query count specifies a number of document identifiers to obtain from the document OKVS. Accessing the document OKVS to obtain the list of document identifiers may include limiting a number of the document identifiers included in the list of document identifiers to the number specified by the query count. When the number of document identifiers included in the list of document identifiers obtained from the document OKVS is less than the number specified by the query count, the method, in some implementations, includes appending, by the data processing hardware, one or more dummy document identifiers to the list of document identifiers for return to the user device. Optionally, the respective keyword identifier associated with each document identifier in the list of document identifiers obtained from the document OKVS includes a unique numerical indicator indicating a creation date of the document identifier relative to creation dates of the other document identifiers in the list of document identifiers.
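By way of illustration only, the limiting and padding of the list of document identifiers to the query count may be sketched as follows. The function name limit_and_pad and the DUMMY_PREFIX marker are hypothetical and do not appear in the disclosure; how dummy document identifiers are encoded so that the user device can recognize them is not specified here, and the plain-text prefix is used purely for readability.

```python
import secrets

# Hypothetical marker for dummy identifiers; not taken from the disclosure.
DUMMY_PREFIX = "dummy-"


def limit_and_pad(doc_ids, query_count):
    """Return exactly query_count identifiers: truncate, or pad with dummies."""
    # Keep at most query_count real document identifiers.
    result = list(doc_ids[:query_count])
    # Append dummy identifiers until the list reaches query_count entries.
    while len(result) < query_count:
        result.append(DUMMY_PREFIX + secrets.token_hex(8))
    return result


# Two matches padded up to a query count of four.
padded = limit_and_pad(["doc52", "doc147"], 4)
# Three matches limited to a query count of two.
truncated = limit_and_pad(["doc52", "doc147", "doc200"], 2)
```

In this sketch the returned list always has the length given by the query count, regardless of how many documents actually match.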

In some implementations, the method further includes, prior to accessing the document OKVS, accessing, by the data processing hardware, a counts OKVS to determine a number of the one or more encrypted documents the keyword appears in. The counts OKVS includes a plurality of keyword counts. Each keyword count indicates a number of the encrypted documents within the corpus of encrypted documents that a respective keyword appears in. The method further includes, in some examples, accessing, by the data processing hardware, a deletion OKVS to identify one or more document identifiers associated with a deletion of the keyword. Each identified document identifier is concatenated with the keyword and uniquely identifies a respective one of the one or more encrypted documents in which the keyword is deleted. The list of document identifiers obtained from the document OKVS may exclude any of the one or more document identifiers identified from the deletion OKVS.

In some implementations, the deletion OKVS includes a set of keywords concatenated with document identifiers. Each keyword in the set of keywords concatenated with a respective document identifier uniquely identifies a respective encrypted document within the corpus of encrypted documents in which the keyword appears or from which the keyword has been deleted. In some examples, the method further includes, for an updated encrypted document uploaded by the user into the corpus of encrypted documents stored on the untrusted storage device, receiving, at the data processing hardware from the user device, a set of keywords associated with the updated encrypted document and a document identifier uniquely identifying the updated encrypted document. For each keyword in the set of keywords associated with the updated encrypted document, the method may include incrementing, by the data processing hardware, a keyword count associated with the keyword in a counts OKVS. The counts OKVS includes a plurality of keyword counts. Each keyword count indicates a number of the encrypted documents within the corpus of encrypted documents that a respective keyword appears in. The method may also include inserting, by the data processing hardware, a concatenation of the keyword and a respective keyword identifier associated with the document identifier into the document OKVS and updating, by the data processing hardware, a deletion status of the associated concatenation in the deletion OKVS to indicate that the keyword is not deleted from the associated encrypted document.

The method optionally further includes, for an existing encrypted document in the corpus of encrypted documents stored on the untrusted storage device, receiving, at the data processing hardware from the user device, a deletion request including a set of keywords to be deleted from the existing encrypted document and a document identifier uniquely identifying the existing encrypted document. For each keyword in the set of keywords to be deleted from the existing encrypted document, the method includes updating, by the data processing hardware, a deletion status associated with the keyword concatenated with the respective document identifier in the deletion OKVS to indicate that the keyword is deleted from the existing encrypted document uniquely identified by the respective document identifier.
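As a non-limiting sketch of the deletion handling described above, the deletion OKVS may be modeled as a map from a keyword concatenated with a document identifier to a deletion status. The sketch below is an assumption-laden illustration: a plain Python dict stands in for the deletion OKVS (and therefore provides none of the oblivious guarantees), a tuple stands in for the concatenation, and the function names are hypothetical.

```python
def mark_deleted(deletion_okvs, doc_id, keywords):
    """Record that each keyword is deleted from the document doc_id."""
    for keyword in keywords:
        deletion_okvs[(keyword, doc_id)] = True


def mark_present(deletion_okvs, doc_id, keywords):
    """Record that each keyword is (again) not deleted from the document doc_id."""
    for keyword in keywords:
        deletion_okvs[(keyword, doc_id)] = False


def exclude_deleted(deletion_okvs, keyword, doc_ids):
    """Drop identifiers of documents from which the keyword has been deleted."""
    return [d for d in doc_ids if not deletion_okvs.get((keyword, d), False)]


# A plain dict stands in for the deletion OKVS (assumption for illustration).
deletion_okvs = {}
mark_deleted(deletion_okvs, "doc52", ["cat"])
remaining = exclude_deleted(deletion_okvs, "cat", ["doc52", "doc147"])

# An updated document that contains the keyword again clears the deletion status.
mark_present(deletion_okvs, "doc52", ["cat"])
restored = exclude_deleted(deletion_okvs, "cat", ["doc52", "doc147"])
```

Here a keyword/document pair with no entry is treated as not deleted, which is a design choice of the sketch rather than a requirement stated in the disclosure.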

Another aspect of the disclosure provides a system for providing response-hiding searchable encryption. The system includes data processing hardware and memory hardware in communication with the data processing hardware. The memory hardware stores instructions that when executed on the data processing hardware cause the data processing hardware to perform operations. The operations include receiving a search query for a keyword from a user device associated with a user. The keyword appears in one or more encrypted documents within a corpus of encrypted documents stored on an untrusted storage device. The operations also include accessing a document oblivious key-value storage (OKVS) to obtain a list of document identifiers associated with the keyword. Each document identifier in the list of document identifiers is associated with a respective keyword identifier concatenated with the keyword and uniquely identifies a respective one of the one or more encrypted documents that the keyword appears in. The operations also include returning the list of document identifiers obtained from the document OKVS to the user device.

This aspect may include one or more of the following optional features. In some implementations, the operations further include receiving, from the user device, a read request including one or more of the document identifiers from the returned list of document identifiers. For each document identifier received in the read request, the operations include retrieving the respective one of the one or more encrypted documents that the keyword appears in from the untrusted storage device and returning the retrieved respective one of the one or more encrypted documents that the keyword appears in to the user device. The user device is configured to decrypt the retrieved respective one of the one or more encrypted documents.

The operations, in some examples, further include, for a new encrypted document uploaded by the user into the corpus of encrypted documents stored on the untrusted storage device, receiving, from the user device, a set of keywords associated with the new encrypted document and a new document identifier uniquely identifying the new encrypted document. The operations may also include determining whether the new document identifier exists in an identifier OKVS, the identifier OKVS including a set of document identifiers. Each document identifier in the set of document identifiers uniquely identifies a respective one of the encrypted documents within the corpus of encrypted documents stored on the untrusted storage device. When the new document identifier does not exist in the identifier OKVS, the operations may include updating the identifier OKVS with the new document identifier uniquely identifying the new encrypted document. For each keyword in the set of keywords associated with the new encrypted document, the operations optionally include incrementing a keyword count associated with the keyword in a counts OKVS. The counts OKVS includes a plurality of keyword counts. Each keyword count indicates a number of the encrypted documents within the corpus of encrypted documents that a respective keyword appears in. The operations may also include inserting a concatenation of the keyword and a respective keyword identifier associated with the new document identifier into the document OKVS.

Incrementing the keyword count associated with the keyword in the counts OKVS may include, when the keyword count is greater than or equal to one, increasing the keyword count by one. The operations may also include, when the keyword count is not greater than or equal to one, setting the keyword count to one. In some implementations, the operations further include, when the new document identifier exists in the identifier OKVS, discarding the new document identifier and the set of keywords associated with the new encrypted document.

In some examples, where the search query for the keyword received from the user device includes a query count, the query count specifies a number of document identifiers to obtain from the document OKVS. Accessing the document OKVS to obtain the list of document identifiers may include limiting a number of the document identifiers included in the list of document identifiers to the number specified by the query count. When the number of document identifiers included in the list of document identifiers obtained from the document OKVS is less than the number specified by the query count, the operations, in some implementations, include appending one or more dummy document identifiers to the list of document identifiers for return to the user device. Optionally, the respective keyword identifier associated with each document identifier in the list of document identifiers obtained from the document OKVS includes a unique numerical indicator indicating a creation date of the document identifier relative to creation dates of the other document identifiers in the list of document identifiers.

In some implementations, the operations further include, prior to accessing the document OKVS, accessing a counts OKVS to determine a number of the one or more encrypted documents the keyword appears in. The counts OKVS includes a plurality of keyword counts. Each keyword count indicates a number of the encrypted documents within the corpus of encrypted documents that a respective keyword appears in. The operations further include, in some examples, accessing a deletion OKVS to identify one or more document identifiers associated with a deletion of the keyword. Each identified document identifier is concatenated with the keyword and uniquely identifies a respective one of the one or more encrypted documents in which the keyword is deleted. The list of document identifiers obtained from the document OKVS may exclude any of the one or more document identifiers identified from the deletion OKVS.

In some implementations, the deletion OKVS includes a set of keywords concatenated with document identifiers. Each keyword in the set of keywords concatenated with a respective document identifier uniquely identifies a respective encrypted document within the corpus of encrypted documents in which the keyword appears or from which the keyword has been deleted. In some examples, the operations further include, for an updated encrypted document uploaded by the user into the corpus of encrypted documents stored on the untrusted storage device, receiving, from the user device, a set of keywords associated with the updated encrypted document and a document identifier uniquely identifying the updated encrypted document. For each keyword in the set of keywords associated with the updated encrypted document, the operations may include incrementing a keyword count associated with the keyword in a counts OKVS. The counts OKVS includes a plurality of keyword counts. Each keyword count indicates a number of the encrypted documents within the corpus of encrypted documents that a respective keyword appears in. The operations may also include inserting a concatenation of the keyword and a respective keyword identifier associated with the document identifier into the document OKVS and updating a deletion status of the associated concatenation in the deletion OKVS to indicate that the keyword is not deleted from the associated encrypted document.

The operations optionally further include, for an existing encrypted document in the corpus of encrypted documents stored on the untrusted storage device, receiving, from the user device, a deletion request including a set of keywords to be deleted from the existing encrypted document and a document identifier uniquely identifying the existing encrypted document. For each keyword in the set of keywords to be deleted from the existing encrypted document, the operations include updating a deletion status associated with the keyword concatenated with the respective document identifier in the deletion OKVS to indicate that the keyword is deleted from the associated encrypted document uniquely identified by the respective document identifier.

The details of one or more implementations of the disclosure are set forth in the accompanying drawings and the description below. Other aspects, features, and advantages will be apparent from the description and drawings, and from the claims.

DESCRIPTION OF DRAWINGS

FIG. 1 is a schematic view of an example system for providing response-hiding searchable encryption.

FIG. 2 is a schematic view of an example system for retrieving documents that include a queried keyword.

FIG. 3 is a schematic view of an example system for adding an encrypted document to a corpus of documents to be searched.

FIG. 4 is a schematic view of a list generator for generating a list of document identifiers for the system of FIG. 1.

FIG. 5 is a schematic view of an example system for deleting keywords from an encrypted document.

FIG. 6 is a flowchart of an example method for providing response-hiding searchable encryption.

FIG. 7 is a schematic view of an example computing device that may be used to implement the systems and methods described herein.

Like reference symbols in the various drawings indicate like elements.

DETAILED DESCRIPTION

Searchable encryption (which may also be referred to as encrypted search) has been well studied for more than a decade. The goal of searchable encryption is to enable a client to outsource the storage of a corpus of encrypted documents to an untrusted server. For example, the client may wish to store a large number of documents securely in a cloud-based storage solution. Generally, the client will desire to maintain the ability to efficiently search the documents (i.e., search for a specific keyword), while simultaneously maintaining the privacy and security of the documents. In order to maintain this privacy, information related to the contents of the documents or the queries from the client must remain hidden from the server. Currently, the only known way to maintain perfect privacy is by using computationally expensive cryptographic primitives such as fully homomorphic encryption. The large performance overheads of these primitives preclude them from being used in practical applications. Instead, most current implementations sacrifice a portion of privacy with the aim of improving efficiency.

The privacy of searchable encryption schemes is parameterized by a leakage function. The leakage function is an upper bound on the information revealed to the untrusted server when processing queries over the stored documents. Therefore, it is advantageous for a searchable encryption scheme to minimize the leakage function while maximizing the efficiency of searches. Modern solutions offer a query processing overhead that scales linearly with the number of matching documents with very small hidden constants. Some techniques offer dynamic schemes that also enable inserting new documents and/or modifying existing documents. These systems typically offer an overhead that linearly scales with the number of updated keywords. However, these techniques are all response-revealing. That is, the leakages of these schemes contain the identifiers of matching documents for each query (which may also be known as access pattern leakage).

Using various and continuously improving frequency analysis and statistical learning methods, the contents of documents and/or the queried keywords may be compromised by using exclusively access pattern leakage. For example, some attacks are based on schemes that enable clients to perform range queries. In another scenario, adversaries may inject files into an encrypted search scheme. By carefully arranging keywords in the injected files, adversaries viewing the identifiers of matching injected documents of any query may determine the queried keyword with perfect accuracy. A common denominator among these schemes is that each is response-revealing. Thus, it is advantageous to protect against current and future improvements of these attacks. An effective means of mitigating the risk of these attacks is to employ response-hiding searchable encryption schemes, as the previously described attacks critically rely on the fact that leakage of queries includes the identifiers of matching documents. However, current response-hiding schemes incur significantly larger overhead compared to their response-revealing counterparts. In particular, response-hiding schemes perform at least logarithmic server operations for each query response. Additionally, dynamic variants of response-hiding schemes incur at least logarithmic server computation for each modified keyword. A response-hiding scheme is defined as an encryption scheme in which the query leakage does not reveal the identifiers of matching documents.

Implementations herein are directed toward an asymptotically optimal, dynamic, response-hiding searchable encryption manager that implements dynamic searchable encryption by using oblivious random access memory (ORAM) in a blackbox manner with constant efficiency in terms of ORAM operations. The manager's leakage consists only of the number of matching documents for queries and the number of unique keywords in inserted documents. ORAM enables a client to access a server-stored array without revealing either the array contents or the indices updated or retrieved. That is, an ORAM hides its access pattern by ensuring that, for every input, the memory locations accessed are similarly distributed. ORAMs come in many different implementations and efficiencies. For example, there are path ORAMs, square root ORAMs, and tree-based ORAMs. As the searchable encryption manager implements ORAM in a blackbox manner, the manager is compatible with any type of ORAM. Thus, the manager will inherently gain any benefits from improving ORAM technologies. The manager enables a user to query for keywords among a corpus of encrypted documents and retrieve, for example, a list of identifiers of documents containing the queried keyword, metadata regarding the keyword, related topics of identified documents, portions of the document text, or even the entire document text. The manager further enables the user to update the encrypted documents (e.g., add, delete, and modify the documents) while minimizing the information that the server learns about stored documents and queried keywords.

Referring to FIG. 1, in some implementations, an example system 100 includes a user device 10 associated with a respective user or client 12 and in communication with a remote system 111 via a network 112. The user device 10 may correspond to any computing device, such as a desktop workstation, a laptop workstation, or a mobile device (e.g., a smart phone). The remote system 111 may be a single computer, multiple computers, or a distributed system (e.g., a cloud environment) having scalable/elastic computing resources 118 (e.g., data processing hardware) and/or storage resources 116 (e.g., memory hardware). A document data store 150 is overlain on the storage resources 116 to allow scalable use of the storage resources 116 by one or more of the client or computing resources 118. The document data store 150 is configured to store a corpus of encrypted documents 152, 152 a-n. Each document 152 includes a document identifier 154 that uniquely identifies the associated document 152 (e.g., a document name). Each document 152 also includes a set of keywords 32. The set of keywords 32 includes all keywords that appear in the associated encrypted document 152. As used herein, a document 152 may refer to any encrypted item uploaded onto the remote system 111 for storage within the document data store 150, such as, without limitation, emails, calendar events, notes, database entries, etc. In some examples, the remote system 111 executes a Searchable Encryption (SE) manager 120 for managing access to the encrypted documents 152 within the document data store 150.

The SE manager 120, in some examples, receives a search query 30 from the user device 10 via the network 112. The search query 30 includes one or more keywords 32 that the user 12 is searching for within one or more of the encrypted documents 152 stored in the untrusted document data store 150. For example, the user 12 may wish to determine which, if any, encrypted documents 152 include the specific keyword 32 “cat”. In response to the search query 30, the SE manager 120 returns, in some examples, a list 40 of document identifiers (IDs) 154 that each uniquely identify a respective one of the encrypted documents 152 that contains the keyword(s) 32 included in the search query 30 (e.g., the list 40 of document identifiers 154 of the documents 152 that contain the keyword 32 “cat”). In other examples, the SE manager 120 returns other relevant information, such as metadata regarding the keyword, related topics of identified documents, portions of the document text, or even the entire document text. If the keyword is associated with a definition (e.g., the document is a dictionary), the definition may also be returned.

To fulfill the user's search query 30, the SE manager 120 accesses a document oblivious key-value storage (OKVS) 160. An OKVS, like ORAM, conceals client 12 access patterns to data within the OKVS 160. An OKVS maintains a key-value map 170 where each key 161 is uniquely associated with a value 162. The oblivious nature of the OKVS ensures that the OKVS only leaks (i.e., exposes) the number of operations performed and the maximum capacity of unique keys to an adversary. That is, an adversary monitoring the accesses to the OKVS 160 cannot determine the values 162 read from or written to the OKVS 160. The document OKVS 160 includes an array or list of keys 161, and each key 161 is associated with one or more values 162. In some examples, each key 161 of the document OKVS 160 includes a keyword 32 (i.e., one of the keywords 32 of the encrypted documents 152) concatenated with a keyword identifier 164. The key 161 (i.e., the keyword 32 concatenated with the keyword ID 164) may be associated with a value 162 that includes a document identifier 154. That is, each keyword 32 concatenated with the keyword ID 164 is associated with a document ID 154 of a document 152 that the keyword 32 appears in.

In the example shown, the key 161 “cat1” (i.e., the keyword 32 “cat” is concatenated with the keyword ID 164 “1”) is associated with the document ID 154 “doc52.” That is, the keyword 32 “cat” appears in the encrypted document 152 associated with the document ID 154 “doc52”. Similarly, the key 161 “cat2” (i.e., the keyword 32 “cat” is concatenated with the keyword ID 164 “2”) is associated with the value 162 “doc147” (i.e., the document ID 154 “doc147”) and the key 161 “catn” is associated with the value 162 “docn”. The SE manager 120, after receiving the search query 30 containing a keyword 32, accesses the document OKVS 160 to obtain a list 40 of document IDs 154 identifying the documents 152 that contain the queried keyword 32. For example, if the query 30 includes the keyword 32 “cat”, the SE manager 120 may access the document OKVS 160 to obtain each value 162 associated with the keyword 32 “cat”. Each keyword ID 164 may include a unique numerical indicator for that keyword 32 (e.g., ‘1’, ‘2’, ‘3’, etc.). The SE manager 120 may repeatedly access the document OKVS 160 with a key 161 of the keyword 32 (e.g., “cat”) and an incrementing keyword ID 164. That is, the SE manager 120 may access the value 162 associated with the key 161 “cat1”, and then access the value 162 associated with the key 161 “cat2”, then “cat3”, “cat4”, and so on and so forth until all of the keys 161 with the respective keyword 32 have been accessed. The SE manager 120 may add the document ID 154 (i.e., the value 162) obtained for each accessed key 161 to the list 40 of document IDs. Once complete (i.e., all keys 161 associated with the keyword 32 have been accessed), the SE manager 120 may return the list 40 to the user device 10. In some examples, the keyword ID 164 indicates a creation date of the document identifier 154 relative to creation dates of the other document identifiers 154 in the list 40 of document identifiers 154. For example, when an encrypted document 152 is uploaded, the SE manager 120 may assign a keyword ID 164 that is greater than any previously assigned keyword IDs 164. Thus, the greater the keyword ID 164, the later in time (relative to the other documents 152) a respective document 152 was uploaded. In some examples, the creation date of the document identifier 154 refers to a last update to the encrypted document 152 uniquely identified by the document identifier 154.
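By way of illustration only, the repeated access of the document OKVS 160 with an incrementing keyword ID 164 may be sketched as follows. The sketch makes several assumptions that are not part of the disclosure: a plain in-memory map (the hypothetical class PlainOKVS) stands in for an OKVS and provides none of the oblivious guarantees, the helper make_key uses a separator character so that concatenated keys are unambiguous, and a counts OKVS holding the number of documents per keyword is consulted to decide when all keys 161 for the keyword 32 have been accessed.

```python
class PlainOKVS:
    """Illustrative stand-in for an OKVS: a plain map with no oblivious guarantees."""

    def __init__(self):
        self._map = {}

    def get(self, key):
        return self._map.get(key)

    def put(self, key, value):
        self._map[key] = value


def make_key(keyword, keyword_id):
    # Hypothetical encoding of "keyword concatenated with keyword ID".
    # The separator avoids ambiguity between, e.g., ("cat1", 2) and ("cat", 12).
    return f"{keyword}|{keyword_id}"


def search(document_okvs, counts_okvs, keyword):
    """Collect the document IDs stored under keyword|1, keyword|2, ..., keyword|count."""
    count = counts_okvs.get(keyword) or 0
    doc_ids = []
    for keyword_id in range(1, count + 1):
        doc_id = document_okvs.get(make_key(keyword, keyword_id))
        if doc_id is not None:
            doc_ids.append(doc_id)
    return doc_ids


document_okvs = PlainOKVS()
counts_okvs = PlainOKVS()
document_okvs.put(make_key("cat", 1), "doc52")
document_okvs.put(make_key("cat", 2), "doc147")
counts_okvs.put("cat", 2)

result = search(document_okvs, counts_okvs, "cat")
no_match = search(document_okvs, counts_okvs, "dog")
```

Because the keyword IDs are assigned in increasing order, the returned list in this sketch is ordered from the earliest to the latest inserted document.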

Referring now to FIG. 2, in some implementations, the SE manager 120, after receiving the search query 30 with keyword 32, sends the list 40 of document IDs to a document retriever 210. The document retriever 210 retrieves the encrypted documents 152 from the data store 150 that are uniquely identified by the document identifiers 154 in the list 40. The document retriever 210 may then return the retrieved encrypted documents 152 to the user device 10. In some examples, the SE manager 120 receives a read request 230 from the user device 10 that includes one or more document IDs 154 from the list 40 of document IDs returned by the SE manager 120 in response to a query 30. In this case, the document retriever 210 receives the read request 230 from the SE manager 120 and similarly returns the encrypted documents 152 to the user device 10. The user device 10 may be configured to decrypt the retrieved documents 152. For example, the user device 10 may have access to private keys (e.g., client-side keys) to decrypt the documents 152. Accordingly, only the encrypted documents 152 containing keywords 32 queried for by the user 12 may be returned to the user 12 (e.g., via the user device 10) without performing any decryption operations on the encrypted documents 152 stored in the untrusted data store 150.

Referring now to FIG. 3, in some implementations, the SE manager 120 receives a new document identifier 154N and a set 321 of keywords 32 associated with a new encrypted document 152N uploaded by the user 12 (via the user device 10) to the document data store 150. The set 321 of keywords 32 represents all of the keywords 32 within the document 152N. In some examples, the SE manager 120 determines if the new document identifier 154N already exists in an identifier OKVS 350. The identifier OKVS 350 includes a set of document identifiers 154. Each of the document identifiers 154 uniquely identifies an associated encrypted document 152 stored in the document data store 150 such that the identifier OKVS 350 tracks the identifier 154 uniquely identifying each encrypted document 152 uploaded by the user 12. That is, each key 351 of the identifier OKVS 350 includes a respective document identifier 154 and is associated with a corresponding value 352. In some examples, the value 352 is a constant (e.g., ‘1’) to indicate that the corresponding document identifier 154 exists. In the example shown, the identifier OKVS 350 includes keys 351 (i.e., document IDs 154) “doc52”, “doc147”, . . . , and “docn”, each associated with a value 352 of ‘1’. The SE manager 120 may access the identifier OKVS 350 to ensure that the new document identifier 154N does not uniquely identify an already existing encrypted document 152 stored on the document data store 150. When the new document identifier 154N does already exist (i.e., the new document identifier 154N is not unique), the identifier 154N and keywords 32 may be discarded and/or operations may terminate. In some examples, the system 100 may refuse to add the new encrypted document 152N. Prior to termination, the system 100 may perform fake operations (e.g., no-ops) in order to disguise whether the inserted document was valid or not. That is, the system 100 may perform a series of operations of a similar length when a document is inserted so that an adversary cannot easily tell if the operation was successful.

When the new document identifier 154N does not exist in the identifier OKVS 350, the SE manager 120 may update the identifier OKVS 350 with the new document identifier 154N to uniquely identify the new encrypted document 152N. For each keyword 32 in the set 321 associated with the new encrypted document 152N, the SE manager 120, in some implementations, increments a keyword count 364 associated with the keyword 32 in a counts OKVS 360. The counts OKVS 360, in some examples, includes a keyword count 364 for each keyword 32 appearing in at least one of the encrypted documents 152. Specifically, each keyword count 364 indicates a number of the encrypted documents 152 that a respective keyword 32 appears in. That is, if the keyword 32 “cat” appears in ten separate encrypted documents 152 (irrespective of the number of times the keyword 32 appears within the same document 152), the keyword count 364 associated with the keyword 32 “cat” will equal ten. In this way, the counts OKVS 360 tracks the number of encrypted documents 152 that each keyword 32 appears in. For example, the counts OKVS 360 includes a key 361 that represents a keyword 32 (e.g., “cat”) and each key 361 is associated with a value 362 that represents the keyword count 364 (e.g., “10”) of the keyword 32.

With continued reference to FIG. 3, the SE manager 120 increments the keyword count 364 associated with each keyword 32. For example, if the previous keyword count 364 for the keyword 32 “cat” was “10”, then the SE manager 120 may increment the keyword count 364 to “11”. That is, the SE manager 120, in some examples, increments the keyword count 364 by one when the keyword count 364 is greater than or equal to one. When the keyword count 364 is not greater than or equal to one (e.g., the keyword count 364 is zero, null, etc.), the SE manager 120 may set the keyword count 364 to one. The updated or incremented keyword count 364 reflects the increase in the appearance of the keyword 32 in the encrypted documents 152 due to the new encrypted document 152N.

In some examples, the SE manager 120, prior to accessing the document OKVS 160 after receiving a search query 30, accesses the counts OKVS 360 to determine a number of the encrypted documents 152 that the keyword 32 specified in the search query 30 appears in. This allows for increased efficiency when the SE manager 120 accesses the document OKVS 160 as the SE manager 120 is aware of how many document identifiers 154 the SE manager 120 needs to retrieve. For example, if the queried keyword 32 is “cat”, and the counts OKVS 360 indicates that “cat” appears in ten encrypted documents 152, the SE manager 120 knows it can stop accessing the document OKVS 160 after ten document identifiers 154 have been retrieved. Without this knowledge, the SE manager 120 may be forced to continue to access and search the entire document OKVS 160 to ensure all document identifiers 154 have been retrieved, because, in some examples, deleted documents 152 could lead to missing keyword IDs 164. For example, if the keyword 32 “cat” was removed from the document 152 associated with the “cat2” key 161, the SE manager 120 would be unaware if the document 152 associated with the “cat1” key 161 was the final document identifier 154 or if a document/keyword was removed, and therefore the SE manager 120 may continue searching.
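A count-bounded lookup of this kind might be sketched as follows, again assuming plain dictionaries in place of the document OKVS 160 and the counts OKVS 360. The function name search_with_count and the max_keyword_id safety bound are hypothetical additions for the illustration and are not part of the disclosure.

```python
# Minimal sketch: dicts stand in for the document OKVS 160 and counts OKVS 360.
# Assumption: max_keyword_id is a hypothetical safety bound for this example.

def search_with_count(document_okvs: dict[str, str],
                      counts_okvs: dict[str, int],
                      keyword: str,
                      max_keyword_id: int = 1000) -> list[str]:
    """Stop as soon as the number of IDs given by the counts OKVS is found."""
    expected = counts_okvs.get(keyword, 0)
    doc_ids = []
    keyword_id = 1
    while len(doc_ids) < expected and keyword_id <= max_keyword_id:
        value = document_okvs.get(f"{keyword}{keyword_id}")
        if value is not None:  # a missing keyword ID is skipped, not treated as the end
            doc_ids.append(value)
        keyword_id += 1
    return doc_ids


# "cat2" is missing, so a naive scan would stop after "cat1".
document_okvs = {"cat1": "doc52", "cat3": "doc147"}
counts_okvs = {"cat": 2}
cat_docs = search_with_count(document_okvs, counts_okvs, "cat")
```

In this sketch the count tells the loop that two identifiers exist, so the gap at “cat2” does not end the search early and the scan stops immediately after “cat3”.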

In some implementations, the SE manager 120 inserts a concatenation of the keyword 32 and a respective keyword identifier 164 with the new document identifier 154N into the document OKVS 160. That is, in the example shown, the incremented keyword count 364 of “11” is concatenated with the keyword 32 (i.e., “cat11”) and assigned the value 162 of “doc531.” Thus, future search queries 30 for keywords 32 included in the new encrypted document 152N (e.g., “cat”) will return the document identifier 154 associated with the new encrypted document 152N (e.g., “doc531”).
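The insertion flow for a new encrypted document (identifier check, count increment, and document OKVS insertion) can be sketched as follows. Plain dictionaries are assumed in place of the identifier OKVS 350, counts OKVS 360, and document OKVS 160, the function name insert_document is hypothetical, and the fake operations used to disguise a rejected insertion are deliberately not modeled.

```python
# Minimal sketch: dicts stand in for the identifier, counts, and document OKVSs.
# Assumption: the no-op padding performed on a rejected insert is omitted here.

def insert_document(identifier_okvs: dict[str, int],
                    counts_okvs: dict[str, int],
                    document_okvs: dict[str, str],
                    doc_id: str,
                    keywords: list[str]) -> bool:
    """Register doc_id and index each of its distinct keywords."""
    if doc_id in identifier_okvs:
        return False  # identifier is not unique: discard the request
    identifier_okvs[doc_id] = 1  # constant value marks the identifier as existing

    for keyword in sorted(set(keywords)):  # one count per document, not per occurrence
        count = counts_okvs.get(keyword) or 0
        count = count + 1 if count >= 1 else 1  # increment, or set to one if absent
        counts_okvs[keyword] = count
        document_okvs[f"{keyword}{count}"] = doc_id  # e.g. "cat11" -> "doc531"
    return True


identifier_okvs = {"doc52": 1}
counts_okvs = {"cat": 10}
document_okvs: dict[str, str] = {}
inserted = insert_document(identifier_okvs, counts_okvs, document_okvs,
                           "doc531", ["cat", "hat", "cat"])
```

Using the incremented keyword count as the keyword identifier is what keeps keyword identifiers increasing with upload order in this sketch.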

Referring now to FIG. 4, optionally, the search query 30 includes a query count 420. Because an adversary monitoring the system 100 may be able to obtain the number of document identifiers 154 that the SE manager 120 returns, it may be advantageous to conceal the actual number of document identifiers 154 returned. The query count 420 may specify a number of document identifiers 154 to obtain from the document OKVS 160. In some implementations, the SE manager 120 includes a list generator 410 that receives the search query 30 including the keyword 32 and the query count 420. The list generator 410 obtains the document identifiers 154 from the document OKVS 160 that uniquely identify encrypted documents 152 that include the keyword 32. The list generator 410 may limit the number of document identifiers 154 added to the list 40 to the number specified by the query count 420. For example, if there are ten encrypted documents 152 that include the keyword 32 (i.e., the document ID count equals ten) and the query count 420 is set to five, the list generator 410 may return only the first five document identifiers 154 in the list 40. The list generator 410, in some examples, returns other combinations of document identifiers 154 (e.g., the last five) in any order and/or performs random shuffling.

Alternatively, if the query count 420 is larger and there are not enough document identifiers 154 to fulfill the query count 420, the list generator 410 may append one or more dummy document identifiers 430 for return to the user device 10. That is, the list generator 410 may append the dummy document identifiers 430 until the query count 420 is satisfied. For example, if the query count 420 is set to fifteen while there are only ten document identifiers 154 that are associated with the keyword 32, then the list generator 410 may append five dummy document identifiers 430 to the list 40 in order to satisfy the query count 420.
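The truncation and padding behavior of the list generator 410 can be illustrated with a short sketch. The function name build_list and the “dummy” naming scheme are hypothetical; in practice dummy document identifiers 430 would presumably need to be indistinguishable from real identifiers to an observer, which this example does not attempt.

```python
# Minimal sketch of the list generator 410 behavior.
# Assumption: the "dummyN" naming is illustrative only and is easy to tell apart
# from real identifiers, unlike what a deployed system would want.

def build_list(doc_ids: list[str], query_count: int) -> list[str]:
    """Return exactly query_count identifiers, truncating or padding as needed."""
    result = doc_ids[:query_count]  # keep at most the first query_count IDs
    while len(result) < query_count:
        result.append(f"dummy{len(result)}")  # pad with dummy identifiers
    return result


ten_docs = [f"doc{i}" for i in range(10)]
first_five = build_list(ten_docs, 5)    # ten matches, query count of five
padded = build_list(ten_docs, 15)       # ten matches, query count of fifteen
```

Either way the response has the length set by the query count 420, so the response size alone does not reveal how many documents actually match.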

Referring now to FIG. 5, in some implementations, the SE manager 120 accesses a deletion OKVS 510 to identify one or more document identifiers 154 associated with a deletion of a keyword 32. In some instances, an encrypted document 152 may be edited or modified to remove a keyword 32. For example, the user 12 may remove the keyword 32 “cat” from “doc52”. The deletion OKVS 510 includes a key 511 of a keyword 32 concatenated with a document identifier 154 and a value 512 of a deletion flag 514. That is, each key 511 (i.e., the keyword 32 concatenated with the document ID 154) uniquely identifies a respective encrypted document 152 in which the keyword 32 is deleted. The SE manager 120, in some examples, excludes from the list 40 any document identifiers 154 that the deletion OKVS 510 indicates have had the keyword 32 deleted. Returning to the previous example, the deletion OKVS 510 may include a key 511 (i.e., the keyword 32 concatenated with the document identifier 154) of “cat-doc52” with a value 512 associated with the deletion flag 514 of ‘1’. The deletion flag 514 has a value to indicate a keyword 32 has been deleted and a value to indicate the keyword 32 has not been deleted (e.g., a Boolean value). For example, ‘1’, ‘true’, etc., may indicate that the keyword 32 has been deleted while ‘0’, ‘false’, etc., may indicate that the keyword 32 has not been deleted. Thus, the keyword 32 “cat” has been deleted (i.e., the deletion flag 514 is equal to ‘1’) from the encrypted document 152 associated with the document identifier 154 “doc52”. Similarly, the keyword 32 “dog” has been deleted from the encrypted document 152 associated with the document identifier 154 “doc31.” The keyword 32 “cat” has not been deleted (i.e., the deletion flag 514 is equal to ‘0’) from the encrypted document 152 associated with the document identifier 154 “doc89”. That is, the user 12 may have deleted “cat” from “doc89” and then subsequently re-added “cat” back to the document 152 (“doc89”), or optionally, “cat” is one of the keywords 32 in a newly uploaded encrypted document 152.

In some implementations, the SE manager 120 receives a set 532 of keywords 32 associated with a document identifier 154 of an updated encrypted document 152U. The updated encrypted document 152U may be an entirely new encrypted document 152 or a modification to an existing encrypted document 152. For each keyword 32 in the set 532 associated with the updated encrypted document 152U, the SE manager 120, in some implementations, and as described previously, increments a keyword count 364 associated with the keyword 32 in the counts OKVS 360. The SE manager 120 may also insert a concatenation of the keyword 32 and a respective keyword identifier 164 with the updated document identifier 154 into the document OKVS 160. Optionally, the SE manager 120 updates the deletion status/flag 514 of the associated concatenation of the keyword 32 and the document identifier 154 to indicate that the keyword 32 is not deleted from the associated encrypted document 152.

With continued reference to FIG. 5, the SE manager 120, in some implementations, receives from the user device 10, a deletion request 540 that includes a set 534 of keywords 32 to be deleted from an existing encrypted document 152. The deletion request 540 also includes a document identifier 154 to uniquely identify the existing encrypted document 152. For each keyword 32 in the set 534, the SE manager 120 may update the deletion status 514 of the associated concatenation of the keyword 32 and document identifier 154 to indicate that the keyword 32 is deleted from the associated encrypted document 152.
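The deletion flag handling described with reference to FIG. 5 can be sketched as follows, assuming a plain dictionary in place of the deletion OKVS 510 and a hyphen-joined key as in the “cat-doc52” example. The function names are hypothetical and chosen only for this illustration.

```python
# Minimal sketch: a dict stands in for the deletion OKVS 510.
# Assumption: keys are "<keyword>-<document ID>", values are 1 (deleted) or 0.

def deletion_key(keyword: str, doc_id: str) -> str:
    return f"{keyword}-{doc_id}"  # e.g. "cat-doc52"


def mark_deleted(deletion_okvs: dict[str, int], doc_id: str, keywords: list[str]) -> None:
    """Handle a deletion request: flag each keyword as deleted from doc_id."""
    for keyword in keywords:
        deletion_okvs[deletion_key(keyword, doc_id)] = 1


def mark_present(deletion_okvs: dict[str, int], doc_id: str, keywords: list[str]) -> None:
    """Handle an update: flag each keyword as not deleted from doc_id."""
    for keyword in keywords:
        deletion_okvs[deletion_key(keyword, doc_id)] = 0


def filter_deleted(deletion_okvs: dict[str, int], keyword: str, doc_ids: list[str]) -> list[str]:
    """Drop identifiers whose deletion flag for this keyword is set."""
    return [d for d in doc_ids
            if deletion_okvs.get(deletion_key(keyword, d), 0) != 1]


deletion_okvs: dict[str, int] = {}
mark_deleted(deletion_okvs, "doc52", ["cat"])   # "cat" removed from doc52
mark_deleted(deletion_okvs, "doc89", ["cat"])   # removed from doc89 ...
mark_present(deletion_okvs, "doc89", ["cat"])   # ... then re-added
remaining = filter_deleted(deletion_okvs, "cat", ["doc52", "doc89", "doc147"])
```

In this sketch an identifier with no entry in the deletion OKVS is treated as not deleted, which is an assumption made for the example.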

Thus, the system 100 may include three or more independent OKVSs (the document OKVS 160, the counts OKVS 360, the identifier OKVS 350, and the deletion OKVS 510). However, in some implementations, all or some combination of these OKVSs are combined into a single OKVS. The SE manager 120 may initialize each OKVS with a capacity to store any number of unique keys. This capacity bounds the maximum number of document-keyword pairs. Each OKVS only leaks the number of operations performed and the maximum capacity of unique keys and maintains an O(log n) efficiency. That is, all an adversary may learn when observing the system 100 is the number of document identifiers returned (which, as discussed previously, may be obfuscated through the query count) and the maximum size of newly uploaded documents. The system is forward secure as no information about future inserted documents is leaked. The system 100 may be further response-hiding by, for example, encrypting the largest keyword count over all of the keywords and, when performing queries to the document OKVS 160, performing as many queries as the largest keyword count, as this ensures that an adversary cannot determine whether the queried keyword exists or not.
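The idea of always performing as many document OKVS accesses as the largest keyword count can be sketched as follows. A plain dictionary is assumed in place of the document OKVS 160, the function name padded_search is hypothetical, and the access counter exists only to make the padding visible in the example; encryption of the largest keyword count is not modeled.

```python
# Minimal sketch: every query performs exactly max_count accesses.
# Assumption: a dict stands in for the document OKVS 160, and the returned
# access counter is for illustration only.

def padded_search(document_okvs: dict[str, str],
                  keyword: str,
                  max_count: int) -> tuple[list[str], int]:
    """Look up keyword+1 .. keyword+max_count regardless of how many exist."""
    doc_ids = []
    accesses = 0
    for keyword_id in range(1, max_count + 1):
        value = document_okvs.get(f"{keyword}{keyword_id}")
        accesses += 1  # one OKVS access per iteration, hit or miss
        if value is not None:
            doc_ids.append(value)
    return doc_ids, accesses


document_okvs = {"cat1": "doc52", "cat2": "doc147", "dog1": "doc31"}
largest_keyword_count = 2  # "cat" appears in the most documents
cat_result = padded_search(document_okvs, "cat", largest_keyword_count)
missing_result = padded_search(document_okvs, "zebra", largest_keyword_count)
```

A query for a keyword that does not exist performs the same number of accesses as a query for the most common keyword, so the access count by itself does not distinguish the two cases in this sketch.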

FIG. 6 is a flowchart of an example method 600 for providing response-hiding searchable encryption. The flowchart starts at operation 602 with receiving, at data processing hardware 118, a search query 30 for a keyword 32 from a user device 10 associated with a user 12, the keyword 32 appearing in one or more encrypted documents 152 within a corpus of encrypted documents stored on an untrusted storage device 150. At operation 604, the method 600 also includes accessing, by the data processing hardware 118, a document oblivious key-value storage (OKVS) 160 to obtain a list 40 of document identifiers 154 associated with the keyword 32, each document identifier 154 in the list 40 of document identifiers 154 associated with a respective keyword identifier 164 concatenated with the keyword 32 and uniquely identifying a respective one of the one or more encrypted documents 152 that the keyword 32 appears in. At operation 606, the method 600 also includes returning, by the data processing hardware 118, the list 40 of document identifiers 154 obtained from the document OKVS 160 to the user device 10.

FIG. 7 is a schematic view of an example computing device 700 that may be used to implement the systems and methods described in this document. The computing device 700 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The components shown here, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed in this document.

The computing device 700 includes a processor 710, memory 720, a storage device 730, a high-speed interface/controller 740 connecting to the memory 720 and high-speed expansion ports 750, and a low-speed interface/controller 760 connecting to a low-speed bus 770 and a storage device 730. Each of the components 710, 720, 730, 740, 750, and 760 is interconnected using various busses, and may be mounted on a common motherboard or in other manners as appropriate. The processor 710 can process instructions for execution within the computing device 700, including instructions stored in the memory 720 or on the storage device 730 to display graphical information for a graphical user interface (GUI) on an external input/output device, such as display 780 coupled to high-speed interface 740. In other implementations, multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory. Also, multiple computing devices 700 may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, a group of blade servers, or a multi-processor system).

The memory 720 stores information non-transitorily within the computing device 700. The memory 720 may be a computer-readable medium, a volatile memory unit(s), or non-volatile memory unit(s). The non-transitory memory 720 may be physical devices used to store programs (e.g., sequences of instructions) or data (e.g., program state information) on a temporary or permanent basis for use by the computing device 700. Examples of non-volatile memory include, but are not limited to, flash memory and read-only memory (ROM)/programmable read-only memory (PROM)/erasable programmable read-only memory (EPROM)/electronically erasable programmable read-only memory (EEPROM) (e.g., typically used for firmware, such as boot programs). Examples of volatile memory include, but are not limited to, random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), phase change memory (PCM) as well as disks or tapes.

The storage device 730 is capable of providing mass storage for the computing device 700. In some implementations, the storage device 730 is a computer-readable medium. In various different implementations, the storage device 730 may be a floppy disk device, a hard disk device, an optical disk device, or a tape device, a flash memory or other similar solid state memory device, or an array of devices, including devices in a storage area network or other configurations. In additional implementations, a computer program product is tangibly embodied in an information carrier. The computer program product contains instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer- or machine-readable medium, such as the memory 720, the storage device 730, or memory on processor 710.

The high-speed controller 740 manages bandwidth-intensive operations for the computing device 700, while the low-speed controller 760 manages lower bandwidth-intensive operations. Such allocation of duties is exemplary only. In some implementations, the high-speed controller 740 is coupled to the memory 720, the display 780 (e.g., through a graphics processor or accelerator), and to the high-speed expansion ports 750, which may accept various expansion cards (not shown). In some implementations, the low-speed controller 760 is coupled to the storage device 730 and a low-speed expansion port 790. The low-speed expansion port 790, which may include various communication ports (e.g., USB, Bluetooth, Ethernet, wireless Ethernet), may be coupled to one or more input/output devices, such as a keyboard, a pointing device, a scanner, or a networking device such as a switch or router, e.g., through a network adapter.

The computing device 700 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a standard server 700a or multiple times in a group of such servers 700a, as a laptop computer 700b, or as part of a rack server system 700c.

Various implementations of the systems and techniques described herein can be realized in digital electronic and/or optical circuitry, integrated circuitry, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.

A software application (i.e., a software resource) may refer to computer software that causes a computing device to perform a task. In some examples, a software application may be referred to as an “application,” an “app,” or a “program.” Example applications include, but are not limited to, system diagnostic applications, system management applications, system maintenance applications, word processing applications, spreadsheet applications, messaging applications, media streaming applications, social networking applications, and gaming applications.

These computer programs (also known as programs, software, software applications or code) include machine instructions for a programmable processor, and can be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms “machine-readable medium” and “computer-readable medium” refer to any computer program product, non-transitory computer readable medium, apparatus and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term “machine-readable signal” refers to any signal used to provide machine instructions and/or data to a programmable processor.

The processes and logic flows described in this specification can be performed by one or more programmable processors, also referred to as data processing hardware, executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit). Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks. However, a computer need not have such devices. Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

To provide for interaction with a user, one or more aspects of the disclosure can be implemented on a computer having a display device, e.g., a CRT (cathode ray tube), LCD (liquid crystal display) monitor, or touch screen for displaying information to the user and optionally a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer. Other kinds of devices can be used to provide interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback, and input from the user can be received in any form, including acoustic, speech, or tactile input. In addition, a computer can interact with a user by sending documents to and receiving documents from a device that is used by the user, for example, by sending web pages to a web browser on a user's client device in response to requests received from the web browser.

A number of implementations have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the disclosure. Accordingly, other implementations are within the scope of the following claims.

What is claimed is:
 1. A method comprising: receiving, at data processing hardware, a search query for a keyword from a user device associated with a user, the keyword appearing in one or more encrypted documents within a corpus of encrypted documents stored on an untrusted storage device; accessing, by the data processing hardware, a document oblivious key-value storage (OKVS) to obtain a list of document identifiers associated with the keyword, each document identifier in the list of document identifiers associated with a respective keyword identifier concatenated with the keyword and uniquely identifying a respective one of the one or more encrypted documents that the keyword appears in; and returning, by the data processing hardware, the list of document identifiers obtained from the document OKVS to the user device.
 2. The method of claim 1, further comprising receiving, at the data processing hardware from the user device, a read request including one or more document identifiers from the returned list of document identifiers; and for each document identifier received in the read request: retrieving, by the data processing hardware, the respective one of the one or more encrypted documents that the keyword appears in from the untrusted storage device; and returning, by the data processing hardware, the retrieved respective one of the one or more encrypted documents that the keyword appears in to the user device, the user device configured to decrypt the retrieved respective one of the one or more encrypted documents.
 3. The method of claim 1, further comprising, for a new encrypted document uploaded by the user into the corpus of encrypted documents stored on the untrusted storage device: receiving, at the data processing hardware from the user device, a set of keywords associated with the new encrypted document and a new document identifier uniquely identifying the new encrypted document; determining, by the data processing hardware, whether the new document identifier exists in an identifier OKVS, the identifier OKVS comprising a set of document identifiers, each document identifier in the set of document identifiers uniquely identifying a respective one of the encrypted documents within the corpus of encrypted documents stored on the untrusted storage device; when the new document identifier does not exist in the identifier OKVS, updating, by the data processing hardware, the identifier OKVS with the new document identifier uniquely identifying the new encrypted document; and for each keyword in the set of keywords associated with the new encrypted document: incrementing, by the data processing hardware, a keyword count associated with the keyword in a counts OKVS, the counts OKVS comprising a plurality of keyword counts, each keyword count indicating a number of the encrypted documents within the corpus of encrypted documents that a respective keyword appears in; and inserting, by the data processing hardware, a concatenation of the keyword and a respective keyword identifier associated with the new document identifier into the document OKVS.
 4. The method of claim 3, wherein incrementing the keyword count associated with the keyword in the counts OKVS comprises: when the keyword count is greater than or equal to one, increasing the keyword count by one; and when the keyword count is not greater than or equal to one, setting the keyword count to one.
 5. The method of claim 3, further comprising, when the new document identifier exists in the identifier OKVS, discarding, by the data processing hardware, the new document identifier and the set of keywords associated with the new encrypted document.
 6. The method of claim 1, wherein: the search query for the keyword received from the user device comprises a query count, the query count specifying a number of document identifiers to obtain from the document OKVS; and accessing the document OKVS to obtain the list of document identifiers comprises limiting a number of document identifiers included in the list of document identifiers to the number specified by the query count.
 7. The method of claim 6, further comprising, when the number of document identifiers included in the list of document identifiers obtained from the document OKVS is less than the number specified by the query count, appending, by the data processing hardware, one or more dummy document identifiers to the list of document identifiers for return to the user device.
 8. The method of claim 1, wherein the respective keyword identifier associated with each document identifier in the list of document identifiers obtained from the document OKVS comprises a unique numerical indicator indicating a creation date of the document identifier relative to creation dates of the other document identifiers in the list of document identifiers.
 9. The method of claim 1, further comprising, prior to accessing the document OKVS, accessing, by the data processing hardware, a counts OKVS to determine a number of the one or more encrypted documents the keyword appears, the counts OKVS comprising a plurality of keyword counts, each keyword count indicating a number of the encrypted documents within the corpus of encrypted documents that a respective keyword appears in.
 10. The method of claim 1, further comprising: accessing, by the data processing hardware, a deletion OKVS to identify one or more document identifiers associated with a deletion of the keyword, each identified document identifier concatenated with the keyword and uniquely identifying a respective one of the one or more encrypted documents in which the keyword is deleted, wherein the list of document identifiers obtained from the document OKVS excludes any of the one or more document identifiers identified from the deletion OKVS.
 11. The method of claim 10, wherein the deletion OKVS comprises a set of keywords concatenated with document identifiers, each keyword in the set of keywords concatenated with a respective document identifier uniquely identifying a respective encrypted document within the corpus of encrypted documents in which the keyword appears in or has been deleted from.
 12. The method of claim 11, further comprising, for an updated encrypted document uploaded by the user into the corpus of encrypted documents stored on the untrusted storage device: receiving, at the data processing hardware from the user device, a set of keywords associated with the updated encrypted document and a document identifier uniquely identifying the updated encrypted document; for each keyword in the set of keywords associated with the updated encrypted document: incrementing, by the data processing hardware, a keyword count associated with the keyword in a counts OKVS, the counts OKVS comprising a plurality of keyword counts, each keyword count indicating a number of the encrypted documents within the corpus of encrypted documents that a respective keyword appears in; inserting, by the data processing hardware, a concatenation of the keyword and a respective keyword identifier associated with the document identifier into the document OKVS; and updating, by the data processing hardware, a deletion status of the associated concatenation in the deletion OKVS to indicate that the keyword is not deleted from the associated encrypted document.
 13. The method of claim 11, further comprising, for an existing encrypted document in the corpus of encrypted documents stored on the untrusted storage device: receiving, at the data processing hardware from the user device, a deletion request comprising a set of keywords to be deleted from the existing encrypted document and a respective document identifier uniquely identifying the existing encrypted document; and for each keyword in the set of keywords to be deleted from the existing encrypted document, updating, by the data processing hardware, a deletion status associated with the keyword concatenated with the respective document identifier in the deletion OKVS to indicate that the keyword is deleted from the existing encrypted document uniquely identified by the respective document identifier.
 14. A systemcomprising: data processing hardware; and memory hardware incommunication with the data processing hardware, the memory hardwarestoring instructions that when executed on the data processing hardwarecause the data processing hardware to perform operations comprising:receiving a search query for a keyword from a user device associatedwith a user, the keyword appearing in one or more encrypted documentswithin a corpus of encrypted documents stored on an untrusted storagedevice; accessing a document oblivious key-value storage (OKVS) toobtain a list of document identifiers associated with the keyword, eachdocument identifier in the list of document identifiers associated witha respective keyword identifier concatenated with the keyword anduniquely identifying a respective one of the one or more encrypteddocuments that the keyword appears in; and returning the list ofdocument identifiers obtained from the document OKVS to the user device.15. The system of claim 14, wherein the operations further comprise:receiving, from the user device, a read request including one or moredocument identifiers from the returned list of document identifiers; andfor each document identifier received in the read request: retrievingthe respective one of the one or more encrypted documents that thekeyword appears in from the untrusted storage device; and returning theretrieved respective one of the one or more encrypted documents that thekeyword appears in to the user device, the user device configured todecrypt the retrieved respective one of the one or more encrypteddocuments.
 16. The system of claim 14, wherein the operations further comprise, for a new encrypted document uploaded by the user into the corpus of encrypted documents stored on the untrusted storage device: receiving, from the user device, a set of keywords associated with the new encrypted document and a new document identifier uniquely identifying the new encrypted document; determining whether the new document identifier exists in an identifier OKVS, the identifier OKVS comprising a set of document identifiers, each document identifier in the set of document identifiers uniquely identifying a respective one of the encrypted documents within the corpus of encrypted documents stored on the untrusted storage device; when the new document identifier does not exist in the identifier OKVS, updating the identifier OKVS with the new document identifier uniquely identifying the new encrypted document; and for each keyword in the set of keywords associated with the new encrypted document: incrementing a keyword count associated with the keyword in a counts OKVS, the counts OKVS comprising a plurality of keyword counts, each keyword count indicating a number of the encrypted documents within the corpus of encrypted documents that a respective keyword appears in; and inserting a concatenation of the keyword and a respective keyword identifier associated with the new document identifier into the document OKVS.
 17. The system of claim 16, wherein incrementing the keyword count associated with the keyword in the counts OKVS comprises: when the keyword count is greater than or equal to one, increasing the keyword count by one; and when the keyword count is not greater than or equal to one, setting the keyword count to one.
 18. The system of claim 16, wherein the operations further comprise, when the new document identifier exists in the identifier OKVS, discarding the new document identifier and the set of keywords associated with the new encrypted document.
 19. The system of claim 14, wherein: the search query for the keyword received from the user device comprises a query count, the query count specifying a number of document identifiers to obtain from the document OKVS; and accessing the document OKVS to obtain the list of document identifiers comprises limiting a number of document identifiers included in the list of document identifiers to the number specified by the query count.
 20. The system of claim 19, wherein the operations further comprise, when the number of document identifiers included in the list of document identifiers obtained from the document OKVS is less than the number specified by the query count, appending one or more dummy document identifiers to the list of document identifiers for return to the user device.
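By way of non-limiting illustration only, the limiting and padding of claims 19 and 20 may be sketched as follows. The sketch assumes that document identifiers are strings and that a dummy identifier is a random hexadecimal string carrying a hypothetical "dummy-" prefix so that the user device can recognize and discard it. The claims do not specify a dummy format, and the function name is hypothetical.

```python
import secrets


def limit_and_pad(doc_ids, query_count, dummy_bytes=16):
    """Return exactly query_count identifiers.

    The list is truncated when it is too long (claim 19) and padded with
    dummy identifiers when it is too short (claim 20).
    """
    result = list(doc_ids[:query_count])
    while len(result) < query_count:
        # Hypothetical dummy format: a random hex string with a marker prefix.
        result.append("dummy-" + secrets.token_hex(dummy_bytes))
    return result
```

Under these assumptions, the returned list always has the length named in the query, so the length of the response does not depend on how many documents actually contain the keyword.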
 21. The system of claim 14, wherein the respective keyword identifier associated with each document identifier in the list of document identifiers obtained from the document OKVS comprises a unique numerical indicator indicating a creation date of the document identifier relative to creation dates of the other document identifiers in the list of document identifiers.
 22. The system of claim 14, wherein the operations further comprise, prior to accessing the document OKVS, accessing a counts OKVS to determine a number of the one or more encrypted documents the keyword appears, the counts OKVS comprising a plurality of keyword counts, each keyword count indicating a number of the encrypted documents within the corpus of encrypted documents that a respective keyword appears in.
 23. The system of claim 14, wherein the operations further comprise: accessing a deletion OKVS to identify one or more document identifiers associated with a deletion of the keyword, each identified document identifier concatenated with the keyword and uniquely identifying a respective one of the one or more encrypted documents in which the keyword is deleted, wherein the list of document identifiers obtained from the document OKVS excludes any of the one or more document identifiers identified from the deletion OKVS.
 24. The system of claim 23, wherein the deletion OKVS comprises a set of keywords concatenated with document identifiers, each keyword in the set of keywords concatenated with a respective document identifier uniquely identifying a respective encrypted document within the corpus of encrypted documents in which the keyword appears in or has been deleted from.
 25. The system of claim 24, wherein the operations further comprise, for an updated encrypted document uploaded by the user into the corpus of encrypted documents stored on the untrusted storage device: receiving, from the user device, a set of keywords associated with the updated encrypted document and a document identifier uniquely identifying the updated encrypted document; for each keyword in the set of keywords associated with the updated encrypted document: incrementing a keyword count associated with the keyword in a counts OKVS, the counts OKVS comprising a plurality of keyword counts, each keyword count indicating a number of the encrypted documents within the corpus of encrypted documents that a respective keyword appears in; inserting a concatenation of the keyword and a respective keyword identifier associated with the document identifier into the document OKVS; and updating a deletion status of the associated concatenation in the deletion OKVS to indicate that the keyword is not deleted from the associated encrypted document.
 26. The system of claim 24, wherein the operations further comprise, for an existing encrypted document in the corpus of encrypted documents stored on the untrusted storage device: receiving, from the user device, a deletion request comprising a set of keywords to be deleted from the existing encrypted document and a document identifier uniquely identifying the existing encrypted document; and for each keyword in the set of keywords to be deleted from the existing encrypted document, updating a deletion status associated with the keyword concatenated with the respective document identifier in the deletion OKVS to indicate that the keyword is deleted from the existing encrypted document uniquely identified by the respective document identifier.
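By way of non-limiting illustration only, the deletion handling of claims 23-26 may be sketched as follows. The sketch assumes that a plain dictionary stands in for the deletion OKVS, that a tuple stands in for the keyword concatenated with the document identifier, and that the deletion status is a boolean. All function names are hypothetical.

```python
def mark_deleted(deletion_okvs, doc_id, keywords):
    """Claim 26: record that each keyword is deleted from the document."""
    for keyword in keywords:
        deletion_okvs[(keyword, doc_id)] = True


def mark_not_deleted(deletion_okvs, doc_id, keywords):
    """Claim 25: record that each keyword is present in the updated document."""
    for keyword in keywords:
        deletion_okvs[(keyword, doc_id)] = False


def exclude_deleted(deletion_okvs, keyword, doc_ids):
    """Claim 23: drop identifiers whose keyword/document pair is marked deleted."""
    return [d for d in doc_ids if not deletion_okvs.get((keyword, d), False)]
```

In this sketch, a deletion only flips a status flag. Entries in the document OKVS are left in place and are filtered out when search results are assembled.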